Abstract: The k-set agreement problem is a generalization of the consensus problem. Namely, assuming each process proposes a
value, each non-faulty process has to decide a value such that each decided value was proposed, and no more than k different values
are decided. This is a hard problem in the sense that it cannot be solved in asynchronous systems as soon as k or more processes may
crash. One way to circumvent this impossibility consists in weakening its termination property, requiring that a process terminates
(decides) only if it executes alone during a long enough period. This is the well-known obstruction-freedom progress condition.
Considering a system of n anonymous asynchronous processes, which communicate through atomic read/write registers only,
and where any number of processes may crash, this paper addresses and solves the challenging open problem of designing an
obstruction-free k-set agreement algorithm with (n - k + 1) atomic registers only. From a shared memory cost point of view, this
algorithm is the best algorithm known so far, thereby establishing a new upper bound on the number of registers needed to solve the
problem (its gain is (n - k) with respect to the previous upper bound). The algorithm is then extended to address the repeated version
of (n, k)-set agreement. As it is optimal in the number of atomic read/write registers, this algorithm closes the gap on previously
established lower/upper bounds for both the anonymous and non-anonymous versions of the repeated (n, k)-set agreement problem.
Finally, for 1 ≤ x ≤ k < n, a generalization suited to x-obstruction-freedom is also described, which requires (n - k + x) atomic
registers only.

Key-words: Anonymous processes, Asynchronous system, Atomic read/write register, Bounded number of registers, Consensus,
Distributed algorithm, Distributed computability, Fault-tolerance, k-Set agreement, Obstruction-freedom, Process crash, Repeated
k-set agreement, Upper bound.



Asynchronous and anonymous k-set agreement with (n - k + 1) atomic registers

Summary: This article presents an asynchronous algorithm that solves k-set agreement in a system of n asynchronous and
anonymous processes communicating through (n - k + 1) atomic read/write registers, and in which any number of processes
may stop unexpectedly (crash failure). The liveness property guaranteed by the algorithm is called “obstruction-freedom”.

Keywords: k-set agreement, complexity bound, consensus, asynchronous system, anonymous system, atomic read/write
registers, process crash, distributed computing, fault tolerance.
1 Introduction

A first challenge: cope with multi-writer atomic registers. Pioneering works (such as [21, 25]) have shown that processes
have to cope not only with finite asynchrony (finite but arbitrary process speed) but also with infinite asynchrony (process crash
failures), a context in which mutex-based synchronization mechanisms become useless. This approach has promoted the design of
concurrent algorithms as a central topic of fault-tolerant distributed computing. See for example Herlihy’s seminal paper [16], or
recent textbooks such as [19, 26, 29].

When processes may communicate with Single-Writer Multi-Reader (SWMR) atomic registers, a concurrent algorithm usually 
associates an SWMR register with each process. This type of registers allows any process to give information to all the other 
processes by writing in its own register, and obtain information from them by reading their SWMR registers. The classical snapshot 
algorithm introduced in [1] is a well-known example of use of such atomic registers.

When processes communicate with Multi-Writer Multi-Reader (MWMR) atomic registers, the situation is different. As any 
process can write any register, the previous association is no longer given for free. An approach to cope with such registers consists 
in emulating SWMR registers on top of MWMR registers, and then benefit from existing SWMR-based algorithms. It is shown
in [?] that, in a system of n processes, (a) (2n - 1) MWMR atomic registers are needed to “wait-free” simulate one SWMR
atomic register, and (b) only n MWMR atomic registers are needed if the simulation is required to be only “non-blocking” (see footnote 1).

This simulation approach becomes irrelevant if the underlying system provides the n processes with less than n atomic MWMR 
registers. So, we focus here on what we name genuine concurrent algorithms, where “genuine” means “without simulating SWMR 
registers on top of MWMR registers”. An important question is then “Given a problem, how many MWMR atomic registers are 
needed to solve it with a genuine algorithm?” Unfortunately, as stressed in [?], the design of genuine algorithms based on MWMR
atomic registers is still in its infancy, and sometimes resembles “black art” in the sense that their underlying intuition is difficult to 
capture and formulate. 

A second challenge: cope with anonymous processes. In some algorithms based on MWMR atomic registers, a process is
required to write a pair made up of the data value it wants to write, plus control values, those including its identity. This is for
example the case of snapshot algorithms based on MWMR atomic registers [26].

So, a second question that comes to mind is: “Is it possible to solve a given problem with MWMR atomic registers and anonymous 
processes; moreover, if the answer is “yes”, how many registers are needed?” To be more precise, let us recall that, in an anonymous 
system, processes have no identity, have the same code, and the same initialization of their local variables. It is often pointed out
that, for privacy reasons, anonymous systems are becoming more and more important.

Consensus and k-set agreement. The paper considers the k-set agreement problem in a system of n processes. This problem,
introduced in [5], and denoted (n, k)-set agreement in the following, is a generalization of consensus, which corresponds to the instance
where k = 1. Assuming each participating process proposes a value, each non-faulty process must decide a value (termination),
which was proposed by some process (validity), and at most k different values can be decided (agreement).

Impossibility results and the obstruction-freedom progress condition. It is well-known that it is impossible to design a deterministic
wait-free consensus algorithm in asynchronous systems prone to even a single crash failure, be the underlying communication
medium an asynchronous send/receive network [12], or a set of read/write atomic registers [23]. It is also shown in [?, 18, 27] that,
if k or more processes may crash, there is no deterministic wait-free read/write algorithm that can solve (n, k)-set agreement.

As we are interested in the computing power of pure read/write asynchronous systems, we want to neither enrich the underlying
system with additional power such as synchrony assumptions, random numbers, or failure detectors, nor impose constraints restricting
the input vector collectively proposed by the processes. So, we consider here a progress condition weaker than wait-freedom,
named obstruction-freedom [?]. In the consensus or (n, k)-set agreement context, obstruction-freedom requires a process to decide
a value only if it executes solo during a “long enough period” (which means that, during this period, it is not bothered by other
processes). An in-depth study of complexity issues of obstruction-free algorithms is presented in [?].

Several obstruction-free consensus algorithms suited to non-anonymous systems have been proposed (e.g., [?], to cite a few).
When considering anonymous systems, the obstruction-free algorithm presented in [?] requires (8n + 2) MWMR atomic registers
to solve consensus, and the obstruction-free algorithms described in [7, 9] solve (n, k)-set agreement with 2(n - k) + 1 underlying
MWMR atomic registers.

1 “Wait-free” means that any read or write invocation on the SWMR register that is built must terminate if the invoking process does not crash [?]. “Non-blocking”
means that at least one process that does not crash returns from all its read and write invocations [?].
Motivation and content of the paper. This paper presents a genuine obstruction-free algorithm solving the (n, k)-set agreement
problem in an asynchronous anonymous read/write system where any number of processes may crash. This algorithm (called base
algorithm in the following) requires (n - k + 1) MWMR atomic registers (i.e., exactly n registers when one is interested in the
consensus problem).

It is shown in [10] that Ω(√n) MWMR atomic registers is a lower bound for obstruction-free consensus. This lower bound
has recently been generalized to Ω(√(n/k - 2)) for (n, k)-set agreement in anonymous systems [9]. On the other hand, and as already
pointed out, the best obstruction-free (n, k)-set agreement algorithm known so far requires 2(n - k) + 1 MWMR registers [7, 9].
Hence, the base algorithm proposed in this paper provides us with a gain of 2(n - k) + 1 - (n - k + 1) = (n - k) MWMR atomic
registers.
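
As a quick sanity check of the arithmetic above, the following sketch evaluates both register counts for given n and k. The function name register_counts and the dictionary layout are illustrative choices, not part of the paper.

```python
def register_counts(n: int, k: int) -> dict:
    """Register counts quoted in the text for (n, k)-set agreement, with 1 <= k < n."""
    if not 1 <= k < n:
        raise ValueError("expected 1 <= k < n")
    previous = 2 * (n - k) + 1   # best previously known obstruction-free algorithm
    base = n - k + 1             # base algorithm presented in this paper
    return {"previous": previous, "base": base, "gain": previous - base}


example = register_counts(n=5, k=2)     # 7 registers before, 4 now, gain of 3
consensus = register_counts(n=5, k=1)   # for consensus the base algorithm uses n registers
```

For every admissible pair (n, k) the gain equals n - k, as stated in the text.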

In the repeated version of the (n, k)-set agreement problem, the processes participate in a sequence of (n, k)-set agreement
instances. It is shown in [?] that (n - k + 1) atomic registers are necessary to solve repeated (n, k)-set agreement, be the system
anonymous or non-anonymous. The present paper shows that a simple modification of the base obstruction-free (n, k)-set agreement
algorithm solves the repeated (n, k)-set agreement problem without requiring additional atomic registers. It follows that, as this
algorithm requires (n - k + 1) atomic registers, it is optimal, which closes the gap on previously proposed upper bounds for the
repeated (n, k)-set agreement problem.

To attain its goal, the proposed base algorithm, which is round-based, follows the execution pattern “snapshot; local computation; 
write”, where the snapshot and the write are on the (n — k + 1) MWMR atomic registers. This pattern is reminiscent of the one 
called “look; compute; move” introduced in [13, 28] in the context of robot algorithms. Interestingly, no process needs to maintain
local information between successive rounds. In this sense, the algorithm is locally memoryless. 

From a more technical point of view, each atomic register contains a quadruplet consisting of a round number, two control bits, 
and a proposed value (whose size depends only on the application). The algorithm exploits a partial order on the quadruplets that are 
written into MWMR atomic registers. The way each process computes new quadruplets is the key of the algorithm. (The extended
version for repeated (n, k)-set agreement requires sixuplets.)

Roadmap. The paper is composed of 8 sections. Section 2 presents the computing model and definitions used in the paper. The
presentation is done incrementally. First, Section 3 presents the base obstruction-free algorithm solving consensus. This algorithm
captures the essence of the solution. It is proved correct in Section 4. Then, Section 5 extends this base algorithm to obtain an
anonymous obstruction-free algorithm solving (n, k)-set agreement, and Section 6 addresses the case where (n, k)-set agreement is
used repeatedly. Section 7 extends the base algorithm to the x-obstruction-freedom progress condition (only (n - k + x) registers
are then required by the algorithm). Finally, Section 8 concludes the paper.


2 Computation Model and Obstruction-free Consensus 

2.1 Computing Model 

Process model. The system is composed of n asynchronous processes, denoted p_1, ..., p_n. When considering a process p_i, the
integer i is called its index. Indexes are used to facilitate the exposition from an external observer point of view. Processes do not
have identities and have the very same code. We assume that they know the value n.

Up to (n - 1) processes may crash. A crash is an unexpected halting. After it has crashed (if it ever does), a process remains
crashed forever. From a terminology point of view, and given an execution, a faulty process is a process that crashes, and a correct
process is a process that does not crash (see footnote 2).

Let T denote the increasing sequence of time instants (observable only from an external point of view). At each instant, a unique 
process is activated to execute a step. A step consists in a write or a read of an atomic register (access to the shared memory) possibly 
followed by a finite number of internal operations (on the local variables of the process that issued the operation). 

Communication model. In addition to processes, the computing model includes a communication medium made up of m atomic
multi-writer/multi-reader (MWMR) registers (see footnote 3); the value of m depends on the problem we want to solve. These registers are
encapsulated in an array denoted REG[1..m].

“Atomic” means that the read and write operations on a register REG[x], 1 ≤ x ≤ m, appear as if they have been executed
sequentially, and this sequence (a) respects the real-time order of non-concurrent operations, and (b) is such that each read returns
the value written by the closest preceding write operation [22]. When considering any concurrent object defined from a sequential

2 No process knows if it is correct or faulty. This is because, before crashing, a faulty process behaves as a correct process. 

3 Let us notice that the anonymity assumption prevents processes from using single-writer/multi-reader registers. 
specification, atomicity is called linearizability [20]. More generally, the sequence of operations is called a linearization, and the
time instant at which an operation appears as being executed is called its linearization point.

From atomic registers to a snapshot object. At the upper layer (where consensus or (n, k)-set agreement is solved), the array
REG[1..m] is used to define a snapshot object [1]. This object, denoted REG, provides the processes with two operations denoted
write() and snapshot().

When a process invokes REG.write(x, v) it deposits the value v in REG[x]. When it invokes REG.snapshot() it obtains the
value of the whole array. The snapshot object is atomic (see above), which means that each invocation of REG.snapshot() appears
as if it executed instantaneously. Hence, at this observation level, a linearization is a sequence of write and snapshot operations.

An anonymous non-blocking (hence obstruction-free) implementation of a snapshot object is described in [?] (for completeness
this algorithm is presented in Appendix A). This implementation does not require additional atomic registers. In the following we
consider that this snapshot abstraction is supplied by this underlying layer.
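
To make the interface of the snapshot object concrete, here is a minimal sketch of its sequential specification. It is only a model: atomicity is obtained with a single lock, which is an assumption made for brevity and is not the anonymous non-blocking implementation referred to above (a lock-based object does not tolerate crashes). The class name SnapshotObject is illustrative.

```python
import threading
from typing import Any, List


class SnapshotObject:
    """Model of REG[1..m] offering atomic write(x, v) and snapshot().

    Assumption: a global lock serializes the operations, which gives the
    sequential specification only, not a crash-tolerant implementation.
    """

    def __init__(self, m: int, initial: Any) -> None:
        self._regs: List[Any] = [initial] * m
        self._lock = threading.Lock()

    def write(self, x: int, v: Any) -> None:
        """Deposit v in REG[x]; x is 1-based, as in the paper."""
        with self._lock:
            self._regs[x - 1] = v

    def snapshot(self) -> List[Any]:
        """Return a copy of the whole array, as if read instantaneously."""
        with self._lock:
            return list(self._regs)


reg = SnapshotObject(m=3, initial=0)
reg.write(2, "a")
view = reg.snapshot()   # a private copy: later writes do not change it
reg.write(1, "b")
```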

2.2 Obstruction-free consensus and obstruction-free (n, k)-set agreement

Obstruction-free consensus. An obstruction-free consensus object is a one-shot object that provides each process with a single
operation denoted propose(). This operation takes a value as input parameter and returns a value.

“One-shot” means that a process invokes propose() at most once. When a process invokes propose(v), we say that it “proposes
v”. When the invocation of propose() returns value v, we say that the invoking process “decides v”. A process executes “solo” when
it keeps on executing while the other processes have stopped their execution (at any point of their algorithm). The obstruction-free 
consensus problem is defined by the following properties (that is, to be correct, any obstruction-free algorithm must satisfy these 
properties). 

• Validity. If a process decides a value v, this value was proposed by a process. 

• Agreement. No two processes decide different values. 

• OB-termination. If there is a time after which a process executes solo, it decides a value. 

• SV-termination (see footnote 4). If a single value is proposed, all correct processes decide.

Validity relates outputs to inputs. Agreement relates the outputs. Termination states the conditions under which a correct process 
must decide. There are two cases. The first is related to obstruction-freedom. The second one is independent of the concurrency and 
failure pattern; it is related to the input value pattern. 

Obstruction-free (n, k)-set agreement. An obstruction-free (n, k)-set agreement object is a one-shot object which has the same
validity, OB-termination, and SV-termination properties as consensus, and where the agreement property is:

• Agreement. At most k different values are decided.

As for consensus, the SV-termination property is a new property strengthening the classical definition of k-set agreement stated in [?].
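
The two safety properties (validity and agreement) can be checked on the outcome of a finished execution. The sketch below is a hypothetical helper, not part of the paper: it takes the multiset of proposed values, the values decided so far, and k. Termination properties are not covered, since they cannot be judged from a finite outcome alone.

```python
from typing import Dict, Hashable, Iterable


def check_set_agreement(proposed: Iterable[Hashable],
                        decided: Iterable[Hashable],
                        k: int) -> Dict[str, bool]:
    """Check validity and agreement of (n, k)-set agreement on an outcome."""
    proposed_set = set(proposed)
    decided_set = set(decided)
    return {
        "validity": decided_set <= proposed_set,   # every decided value was proposed
        "agreement": len(decided_set) <= k,        # at most k different decided values
    }


ok = check_set_agreement(proposed=[1, 2, 3], decided=[1, 2, 2], k=2)
too_many = check_set_agreement(proposed=[1, 2, 3], decided=[1, 2, 3], k=2)
invented = check_set_agreement(proposed=[1, 2], decided=[5], k=1)
```

With k = 1 the agreement check is exactly the consensus agreement property.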

3 Obstruction-free Anonymous Consensus Algorithm 

The algorithm is described in Figure 2. As indicated in the Introduction, its essence is captured by the quadruplets that can be written
in the MWMR atomic registers.

Shared memory. The shared memory is made up of a snapshot object REG, composed of m = n MWMR atomic registers. Each
of them contains a quadruplet initialized to (0, down, false, ⊥). The meaning of these fields is the following.

• The first field, denoted rd, is a round number.

• The second field, denoted lvl (level), has a value in {up, down}, where up > down.

• The third field, denoted cfl (conflict), is a Boolean (initialized to false). We assume true > false.

• The last field, denoted val, is initialized to ⊥, and then always contains a proposed value. It is assumed that the set of proposed
values is totally ordered, and the default value ⊥ is smaller than any of them.

When considering lexicographical ordering, it is easy to see that all possible quadruplets (rd, lvl, cfl, val) are totally ordered. This
total order, and its reflexive version, are denoted “<” and “≤”, respectively.
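
The quadruplets and their lexicographical order can be modelled directly with a tuple type. The encoding below is an assumption made for illustration: down and up are encoded as 0 and 1, proposed values are taken to be numbers, and ⊥ is represented by minus infinity so that it is smaller than any proposed value.

```python
from typing import NamedTuple

DOWN, UP = 0, 1             # encodes down < up
BOTTOM = float("-inf")      # stands for the default value ⊥ (assumes numeric proposals)


class Quad(NamedTuple):
    rd: int      # round number
    lvl: int     # DOWN or UP
    cfl: bool    # conflict flag, with False < True
    val: float   # a proposed value, or BOTTOM


INITIAL = Quad(0, DOWN, False, BOTTOM)

# NamedTuple inherits tuple comparison, which is lexicographic on (rd, lvl, cfl, val).
a = Quad(1, DOWN, False, 7)
b = Quad(1, DOWN, True, 3)
c = Quad(1, UP, False, 3)
d = Quad(2, DOWN, False, 3)
ordered = sorted([d, b, INITIAL, c, a])
```

The round number dominates the comparison, then the level, then the conflict flag, and the value only breaks remaining ties.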

4 This termination property, which relates termination to the input values, is not part of the classical definition of the obstruction-free consensus problem. It is an 
additional requirement which demands termination under specific circumstances that are independent of the concurrency pattern. 
function sup(T) is                                      % T is a set of quadruplets %
(S1)  let (r, level, -, v) be max(T);                   % lexicographical order %
(S2)  let vals(T) be {w | ∃ (r, -, -, w) ∈ T};
(S3)  let conflict1(T) be ∃ (r, -, true, -) ∈ T;        % conflict inherited %
(S4)  let conflict2(T) be |vals(T)| > 1;                % conflict discovered %
(S5)  let conflict(T) be conflict1(T) ∨ conflict2(T);
(S6)  return (r, level, conflict(T), v).


Figure 1: The function sup()


The notion of a conflict and the function sup(). The function sup(), defined in Figure 1, plays a central role in the obstruction-free
(n, k)-set agreement algorithm. It takes a non-empty set of quadruplets T as input parameter, and returns a quadruplet, which is the
supremum of T, defined as follows.

Let (r, level, -, v) be the maximal element of T according to lexicographical ordering (line S1), and vals(T) the values in the
quadruplets of T associated with the maximal round number r (line S2). The set T is conflicting if one of the two following cases
occurs (line S5).

• There is a quadruplet X = (r, -, true, -) in T (line S3). In this case, there is a quadruplet X ∈ T whose round number is
the highest (X.rd = r), and whose conflict field is set (X.cfl = true). We then say that the conflict is “inherited”.

• There are at least two quadruplets X and Y in T that have the highest round number in T (i.e., X.rd = Y.rd = r), and
contain different values (i.e., X.val ≠ Y.val) (lines S2 and S4). In this case we say that the conflict is “discovered”.

The function sup(T) first checks if T is conflicting (lines S2-S5). Then it returns at line S6 the quadruplet (r, level, conflict(T), v),
where conflict(T) indicates if the input set T is conflicting (line S5). Let us notice that, since true > false, the quadruplet
returned by sup(T) is always greater than, or equal to, the greatest element in T, i.e., sup(T) ≥ max(T).
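
The function sup() of Figure 1 translates almost line by line into executable form. The sketch below keeps the line labels S1-S6 as comments; the tuple encoding (0/1 for down/up, numeric values) is an assumption made for illustration.

```python
from typing import Iterable, NamedTuple

DOWN, UP = 0, 1   # encodes down < up


class Quad(NamedTuple):
    rd: int
    lvl: int
    cfl: bool
    val: float


def sup(quads: Iterable[Quad]) -> Quad:
    """Supremum of a non-empty collection of quadruplets (Figure 1)."""
    t = set(quads)
    top = max(t)                                            # S1: lexicographic maximum
    vals = {q.val for q in t if q.rd == top.rd}             # S2: values at the highest round
    inherited = any(q.rd == top.rd and q.cfl for q in t)    # S3: conflict inherited
    discovered = len(vals) > 1                              # S4: conflict discovered
    return Quad(top.rd, top.lvl, inherited or discovered, top.val)   # S5, S6


# Two different values at the highest round: a conflict is discovered.
s1 = sup([Quad(1, DOWN, False, 3), Quad(1, DOWN, False, 5)])
# A conflict flag at a lower round is ignored.
s2 = sup([Quad(2, UP, False, 3), Quad(1, DOWN, True, 9)])
# A conflict flag at the highest round is inherited.
s3 = sup([Quad(2, DOWN, False, 4), Quad(2, DOWN, True, 4)])
```

In all three examples the result is greater than or equal to the maximal input, as noticed above.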


operation propose(v_i) is
(01)  repeat forever
(02)    view ← REG.snapshot();
(03)    case (∀x : view[x] = (r, up, false, val) where r > 0)    then return(val)
(04)         (∀x : view[x] = (r, down, false, val) where r > 0)  then REG.write(1, (r + 1, up, false, val))
(05)         (∀x : view[x] = (r, level, true, val) where r > 0)  then REG.write(1, (r + 1, down, false, val));
(06)         otherwise let (r, level, cfl, val) ← sup(view[1], ..., view[n], (1, down, false, v_i));
(07)                   x ← smallest index such that view[x] ≠ (r, level, cfl, val);
(08)                   REG.write(x, (r, level, cfl, val))
(09)    end case
(10)  end repeat.


Figure 2: Anonymous obstruction-free Consensus


The algorithm. The algorithm is pretty simple. It consists in an appropriate management of the snapshot object REG, so that the
n quadruplets it contains (a) never allow validity and agreement to be violated, and (b) eventually allow termination under good
circumstances (which occur when obstruction-freedom is satisfied or when a single value is proposed).

When a process p_i invokes propose(v_i), it enters a loop that it will exit at line 03 (if it terminates), by executing the statement
return(val), where val is the value it decides.

After entering the loop, a process first issues a snapshot, and assigns the returned array to its local variable view[1..n] (line 02).
Then, there are two main cases according to the value of view.


Case 1 (lines 03-05). All entries of view_i contain the same quadruplet (r, level, conflict, val), and r > 0.
There are three sub-cases.

- Case 1.1. If the level is up and the conflict is false, the invoking process decides the value val (line 03).

- Case 1.2. If the level is down and the conflict is false, process p_i enters the next round by writing (r + 1, up, false, val)
in the first entry of REG (line 04).

- Case 1.3. If there is a conflict, p_i enters the next round by writing (r + 1, down, false, val) in the first entry of REG
(line 05).

Case 2 (lines 06-08). Not all entries of view_i are equal or one of them contains (0, -, -, -).

In this case, process p_i calls the internal function sup(view[1], ..., view[n], (1, down, false, v_i)) (line 06), which returns a
quadruplet X that is greater than all the input quadruplets or equal to the greatest of them. As we have seen, this quadruplet
X may inherit or discover a conflict. Moreover, as (1, down, false, v_i) is an input parameter of the function sup(), X.val
cannot be ⊥.

Let us notice that, as none of the predicates of lines 03-05 is satisfied, not all entries of view[1..n] can be equal to the previous
quadruplet X. The invoking process p_i writes X into REG[x], where, from its point of view, x is the first entry of REG
whose content is different from X (lines 07-08).


The underlying operational intuition. To understand the intuition that underlies the algorithm, let us first consider the very simple
case where a single process p_i executes the algorithm. It obtains from its first invocation of REG.snapshot() (line 02) a view in
which all elements are equal to (0, down, false, ⊥). Hence, p_i executes line 06 where the invocation of sup() returns the quadruplet
(1, down, false, v_i), which is written into REG[1] at line 08. Then, during its second execution of the loop body, p_i computes a
quadruplet with the help of the function sup(), which returns (1, down, false, v_i), and writes this quadruplet into REG[2]; etc., until
p_i has written (1, down, false, v_i) in all the atomic registers of REG[1..n]. When this has been done, p_i obtains at line 02 a view all
elements of which are equal to (1, down, false, v_i). It consequently executes line 04 and writes (2, up, false, v_i) in REG[1]. Then, during
the following executions of the loop body, it writes (2, up, false, v_i) in the other registers of REG (line 08). When this is done,
p_i obtains a snapshot containing only the quadruplet (2, up, false, v_i). When this occurs, p_i is directed to execute line 03 where it
decides.

Let us now consider the case where, while p_i is executing, another process p_j invokes propose(v_j) with v_j = v_i. It is easy to
see that p_i and p_j then collaborate to fill in REG with the same quadruplet (2, up, false, v_i). If v_j ≠ v_i, depending on the concurrency
pattern, a conflict may occur. For instance, it occurs if REG contains both (1, down, false, v_i) and (1, down, false, v_j). If a
conflict appears, it will be propagated from round to round, until a process executes alone a higher round number.
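
The solo scenario described above can be replayed with a small sequential simulation of the loop body of Figure 2. This is a sketch under simplifying assumptions: REG is a plain Python list (so each snapshot is trivially atomic), a single process runs at a time, values are numbers, and ⊥ is encoded as minus infinity. The helper names step and run_solo are illustrative.

```python
from typing import List, NamedTuple, Optional, Tuple

DOWN, UP = 0, 1
BOTTOM = float("-inf")   # encodes ⊥ (assumes numeric proposed values)


class Quad(NamedTuple):
    rd: int
    lvl: int
    cfl: bool
    val: float


def sup(quads) -> Quad:
    t = set(quads)
    top = max(t)
    vals = {q.val for q in t if q.rd == top.rd}
    cfl = len(vals) > 1 or any(q.rd == top.rd and q.cfl for q in t)
    return Quad(top.rd, top.lvl, cfl, top.val)


def step(reg: List[Quad], v: float) -> Optional[float]:
    """One execution of the loop body; returns the decided value, if any."""
    view = list(reg)                                            # line 02
    first = view[0]
    if first.rd > 0 and all(q == first for q in view):
        if first.cfl:                                           # line 05
            reg[0] = Quad(first.rd + 1, DOWN, False, first.val)
        elif first.lvl == UP:                                   # line 03
            return first.val
        else:                                                   # line 04
            reg[0] = Quad(first.rd + 1, UP, False, first.val)
        return None
    x_quad = sup(view + [Quad(1, DOWN, False, v)])              # line 06
    x = next(i for i, q in enumerate(view) if q != x_quad)      # line 07
    reg[x] = x_quad                                             # line 08
    return None


def run_solo(reg: List[Quad], v: float) -> Tuple[float, int]:
    """Run propose(v) alone until it decides; returns (decision, loop iterations)."""
    iterations = 0
    while True:
        iterations += 1
        decided = step(reg, v)
        if decided is not None:
            return decided, iterations


n = 3
reg = [Quad(0, DOWN, False, BOTTOM)] * n
decision, iterations = run_solo(reg, 7)            # first proposer, running alone
late_decision, late_iterations = run_solo(reg, 9)  # a later proposer adopts the decided value
```

In this solo run the process performs 2n writes (n for round 1, n for round 2) and decides during its (2n + 1)-th execution of the loop body. A process that starts afterwards finds REG filled with (2, up, false, 7) and decides 7 at once.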


Remark 1 Let us notice that no process needs to memorize in its local memory values that will be used in the next round. Not
only are the processes anonymous, but their code is memoryless (no persistent variables). The snapshot object REG constitutes the
whole memory of the system. Hence, as defined in the Introduction, the algorithm is locally memoryless. In this sense, and from a
locality point of view, it has a “functional” flavor.

Remark 2 Let us consider the n-bounded concurrency model [2, 24]. This model is made up of an arbitrary number of processes,
but, at any time, there are at most n processes executing steps. This allows processes to leave the system and other processes to join
it as long as the concurrency degree does not exceed n.

The previous algorithm works without modification in such a model. A proposed value is now a value proposed by any of the N
processes that participate in the algorithm. Hence, if N > n, the number of proposed values can be greater than the upper bound n
on the concurrency degree. This versatility dimension of the algorithm is a direct consequence of the previous “locally memoryless”
property.


4 Proof of the Algorithm 

After a few definitions provided in Section 4.1, Section 4.2 shows that the relation ⊒ defined on quadruplets is a partial order. This
relation is central to prove properties of the algorithm. Such properties are stated and proved in Sections 4.3 and 4.4. Based on these
properties, the correctness of the algorithm is then established.

4.1 Definitions and notations

Let ℰ be a set of quadruplets that can be written in REG. Given X ∈ ℰ, its four fields are denoted X.rd, X.lvl, X.cfl and X.val,
respectively, and > and ≥ refer to the classical lexicographical order on ℰ. Moreover, where appropriate, an array view[1..n] is
considered as the set {view[1], ..., view[n]}.

Definition 1 Let X, Y ∈ ℰ.

X ⊐ Y ≝ (X > Y) ∧ [(X.rd > Y.rd) ∨ (X.cfl) ∨ (¬Y.cfl ∧ X.val = Y.val)].

At the operational level the algorithm ensures that the quadruplets it generates are totally ordered by the relation >. Differently,
the relation ⊐ (which is a partial order on these quadruplets, see Section 4.2) captures the relevant part of this total order, and is
consequently the cornerstone on which the proof of our algorithm relies.

When X ⊐ Y, we say “X strictly dominates Y”. X dominates Y, denoted X ⊒ Y, if (X ⊐ Y) or (X = Y) holds. The relations
⊏ and ⊑ are defined in the natural way.
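
Definition 1 can be written as a predicate on quadruplets. The sketch below assumes the tuple encoding used for illustration (0/1 for down/up, lexicographic tuple comparison) and shows that strict domination is finer than the lexicographical order: two quadruplets of the same round that carry different values without any conflict flag are ordered by >, but neither dominates the other.

```python
from typing import NamedTuple

DOWN, UP = 0, 1


class Quad(NamedTuple):
    rd: int
    lvl: int
    cfl: bool
    val: int


def strictly_dominates(x: Quad, y: Quad) -> bool:
    """X strictly dominates Y (Definition 1)."""
    return x > y and (x.rd > y.rd or x.cfl or (not y.cfl and x.val == y.val))


def dominates(x: Quad, y: Quad) -> bool:
    """X dominates Y: strict domination or equality."""
    return x == y or strictly_dominates(x, y)


a = Quad(1, DOWN, False, 5)
b = Quad(1, DOWN, False, 3)
c = Quad(1, DOWN, True, 5)
d = Quad(2, DOWN, False, 3)
e = Quad(1, UP, False, 3)

greater_but_not_dominating = a > b and not strictly_dominates(a, b)
by_conflict = strictly_dominates(c, b)    # same round, X.cfl is true
by_round = strictly_dominates(d, c)       # higher round
by_same_value = strictly_dominates(e, b)  # same round, no conflict in Y, same value
```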
Definition 2 Given a set of quadruplets T, we shall say that T is homogeneous when it contains a single element, say X. We then
write it “T is ℋ(X)”.

Notation 1 The value, at time τ, of the local variable xxx of a process p_i is denoted xxx_i^τ. Similarly the value of an atomic register
REG[x] at time τ is denoted REG^τ[x], and the value of REG at time τ is denoted REG^τ.

Notation 2 Let W(x, X) denote the writing of a quadruplet X in the register REG[x].

Definition 3 We say “a process p_j covers REG[x] at time τ” when its next non-local step after time τ is W(x, X), where X is the
quadruplet which is written. In this case we also say “W(x, X) covers REG[x] at time τ” or “REG[x] is covered by W(x, X) at
time τ”.

Let us notice that if, at time τ, p_j covers REG[x], then τ necessarily lies between the last snapshot issued by p_j at line 02 and its
planned write W(x, X) that will occur at line 04, 05 or 08.


4.2 The relation ⊒ is a partial order

Lemma 1 ((X ⊒ Y ⊒ Z) ∧ (X.rd = Y.rd = Z.rd)) ⇒ (X.cfl ∨ (¬Z.cfl ∧ X.val = Z.val)).

Proof Let us assume that (¬X.cfl) holds; we have to prove ¬Z.cfl and X.val = Z.val. It follows from the lemma
assumption and the definition of ⊐ that we have:

((X ⊒ Y) ∧ (X.rd = Y.rd) ∧ (¬X.cfl)) ⇒ (¬Y.cfl ∧ X.val = Y.val).

Hence we can use the same argument as above to show that (¬Z.cfl ∧ Y.val = Z.val):

((Y ⊒ Z) ∧ (Y.rd = Z.rd) ∧ (¬Y.cfl)) ⇒ (¬Z.cfl ∧ Y.val = Z.val).

Summarizing we have (¬Z.cfl ∧ X.val = Z.val). This proves the claim. □ Lemma 1

Lemma 2 ⊒ is a partial order.

Proof To prove the transitivity property, let us assume that X ⊒ Y and Y ⊒ Z. We have to show that X ⊒ Z. If X = Y or Y = Z,
the claim follows trivially. Hence, let us assume that Y is neither X nor Z. As (X ⊐ Y) ⇒ (X > Y) and (Y ⊐ Z) ⇒ (Y > Z), it
follows that X > Z. To prove X ⊐ Z, it remains to show that ((X.rd > Z.rd) ∨ (X.cfl) ∨ (¬Z.cfl ∧ X.val = Z.val)). Let us
observe that, due to the definition of ⊐, we have (X ⊐ Y) ⇒ ((X.rd > Y.rd) ∨ (X.cfl) ∨ (¬Y.cfl ∧ X.val = Y.val)). There
are three cases.

• Case (X.rd > Y.rd). As Y ⊐ Z we have (Y.rd ≥ Z.rd). Hence, (X.rd > Z.rd).

• Case (X.rd = Y.rd) ∧ (Y.rd > Z.rd). Then, we have (X.rd > Z.rd).

• Case (X.rd = Y.rd) ∧ (Y.rd = Z.rd). Then, Lemma 1 ⇒ (X.cfl ∨ (¬Z.cfl ∧ X.val = Z.val)).

In each case, the transitivity property follows.

To prove the antisymmetry property, we show that if X ⊐ Y then Y ⋣ X. Assume for contradiction that X ⊐ Y and Y ⊒ X.
It follows that X > Y and Y ≥ X, a contradiction. □ Lemma 2
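
The partial-order properties can also be checked exhaustively on a small finite domain. This brute-force check is an illustration only and is no substitute for the proof; the domain (rounds 1 and 2, two levels, two conflict flags, two values) is an arbitrary choice.

```python
from itertools import product
from typing import NamedTuple


class Quad(NamedTuple):
    rd: int
    lvl: int     # 0 = down, 1 = up
    cfl: bool
    val: int


def dominates(x: Quad, y: Quad) -> bool:
    """X dominates Y: equality, or strict domination as in Definition 1."""
    return x == y or (x > y and (x.rd > y.rd or x.cfl or (not y.cfl and x.val == y.val)))


# 16 quadruplets: 2 rounds x 2 levels x 2 conflict flags x 2 values.
DOMAIN = [Quad(*fields) for fields in product((1, 2), (0, 1), (False, True), (1, 2))]

reflexive = all(dominates(x, x) for x in DOMAIN)
antisymmetric = all(
    x == y or not (dominates(x, y) and dominates(y, x))
    for x, y in product(DOMAIN, repeat=2)
)
transitive = all(
    dominates(x, z)
    for x, y, z in product(DOMAIN, repeat=3)
    if dominates(x, y) and dominates(y, z)
)
# The order is partial, not total: some pairs are incomparable.
incomparable = [
    (x, y) for x, y in product(DOMAIN, repeat=2)
    if not dominates(x, y) and not dominates(y, x)
]
```

For instance, (1, down, false, 1) and (1, down, false, 2) are incomparable, although the lexicographical order ranks the second above the first.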


4.3 Extracting the relations ⊐ and ⊒ from the algorithm

The definition of sup() appears in Figure 1.

Lemma 3 Let T be a set of quadruplets. For every X ∈ T: sup(T) ⊒ X.

Proof Let X ∈ T and S = sup(T). We have to prove that S ⊒ X. Let us first observe that, as S = sup(T) ≥ max(T) ≥ X, we
have S ≥ X. If S = X then the lemma follows immediately. So let us assume in the following that S > X. There are two cases.

• If S.rd > X.rd, then S ⊐ X, and the lemma follows.


Collection des Publications Internes de l’lrisa ©IRIS A 








Z. Bouzid, M. Raynal & P. Sutra 


• Assume that S.rd = X.rd. We need to show that (S.cfl) ∨ (¬X.cfl ∧ S.val = X.val).

In the following we prove that (¬S.cfl ⇒ ¬X.cfl). Therefore we then need only to show that (S.cfl) ∨ (S.val = X.val).

Let us first prove (¬S.cfl ⇒ ¬X.cfl). We do it by proving the contrapositive X.cfl ⇒ S.cfl. If (X.cfl), we have
the following. Since X.rd = S.rd = sup(T).rd, it follows that the predicate conflict1(T) is true, which implies that
S.cfl = sup(T).cfl is also true. Therefore X.cfl ⇒ S.cfl.

Let us now show the second part, i.e., either (S.cfl) or (S.val = X.val) holds. Assume that (S.val ≠ X.val) and let us prove
that (S.cfl) is true. Let us observe that, due to the definition of S = sup(T) (Figure 1), max(T).val = sup(T).val = S.val.
But we assumed S.val ≠ X.val. Therefore max(T).val ≠ X.val. This means that there are at least two elements in
T, namely X and max(T), which are associated with the maximal round S.rd, and which carry distinct values (X.val ≠
max(T).val). Hence, the predicate conflict2(T) is satisfied, and consequently sup(T).cfl is equal to true. Therefore
S = sup(T) ⊐ X. □ Lemma 3
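
The statement of Lemma 3 can be exercised mechanically on small sets. The sketch below enumerates all sets of one to three quadruplets taken from a 16-element domain (an arbitrary choice made to keep the run short) and records any pair for which sup(T) fails to dominate an element of T. It illustrates the lemma and does not replace its proof.

```python
from itertools import combinations, product
from typing import NamedTuple


class Quad(NamedTuple):
    rd: int
    lvl: int     # 0 = down, 1 = up
    cfl: bool
    val: int


def dominates(x: Quad, y: Quad) -> bool:
    return x == y or (x > y and (x.rd > y.rd or x.cfl or (not y.cfl and x.val == y.val)))


def sup(quads) -> Quad:
    t = set(quads)
    top = max(t)
    vals = {q.val for q in t if q.rd == top.rd}
    cfl = len(vals) > 1 or any(q.rd == top.rd and q.cfl for q in t)
    return Quad(top.rd, top.lvl, cfl, top.val)


DOMAIN = [Quad(*fields) for fields in product((1, 2), (0, 1), (False, True), (1, 2))]

checked = 0
violations = []
for size in (1, 2, 3):
    for subset in combinations(DOMAIN, size):
        s = sup(subset)
        checked += 1
        # Lemma 3: sup(T) dominates every element of T.
        violations.extend((subset, x) for x in subset if not dominates(s, x))
```

No violation is found on this domain, in line with the lemma.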


Lemma 4 If p_i executes W(-, Y) at time τ, then for every X ∈ view_i^τ: Y ⊒ X.

Proof We consider two cases according to the line at which the write occurs.

• Y is written at line 04 or 05. It follows that Y.rd = (max(view_i^τ).rd) + 1. Therefore, for every X ∈ view_i^τ: Y.rd > X.rd.
Hence Y ⊐ X.

• Y is written at line 08. In this case, due to the invocation of the function sup() at line 06, the value Y written by p_i is equal
to sup(T) where T = {view_i^τ[1], ..., view_i^τ[n], (1, down, false, v_i)}. According to Lemma 3, it follows that for every
X ∈ view_i^τ we have Y = sup(T) ⊒ X.

□ Lemma 4


Lemma 5 Let us assume that no process is covering $REG[x]$ at time $\tau$. For every write $W(-, X)$ that (a) occurs after $\tau$ and (b)
was not covering a register of REG at time $\tau$, we have $X \sqsupseteq REG^{\tau}[x]$.

Proof The proof is by contradiction. Let $p_i$ be the first process that executes a write $W(-, X)$ contradicting the lemma. This means
that $W(-, X)$ is not covering a register of REG at time $\tau$ and $X \not\sqsupseteq REG^{\tau}[x]$. Let this write occur at time $\tau_2 > \tau$. Thus, all writes
that take place between $\tau$ and $\tau_2$ comply with the lemma. We derive a contradiction by showing that $X \sqsupseteq REG^{\tau}[x]$.

Let $\tau_1 < \tau_2$ be the linearization time of the last snapshot taken by $p_i$ (line 02) before executing $W(-, X)$. Since $W(-, X)$ was
not covering a register of REG at time $\tau$, the snapshot preceding this write was necessarily taken after $\tau$. That is, $\tau_1 > \tau$, and we
have $\tau_2 > \tau_1 > \tau$.

According to Lemma 4, $X \sqsupseteq view_i^{\tau_2}[x]$. But since the snapshot returning $view_i^{\tau_2}$ is linearized at $\tau_1$, it follows that $view_i^{\tau_2} = REG^{\tau_1}$.
Therefore, we have $X \sqsupseteq REG^{\tau_1}[x]$ (assertion R).

In the following we show that $REG^{\tau_1}[x] \sqsupseteq REG^{\tau}[x]$. If $REG[x]$ was not updated between $\tau$ and $\tau_1$, then $REG^{\tau_1}[x] = REG^{\tau}[x]$
and the claim follows. Otherwise, if $REG[x]$ was updated between $\tau$ and $\tau_1$, the content of $REG^{\tau_1}[x]$, let it be Y, is the
result of a write $W(x, Y)$ that occurred between $\tau$ and $\tau_1$ and that was not covering a register of REG at time $\tau$ (remember that
no write is covering $REG[x]$ at time $\tau$). We assumed above that $\tau_2$ is the first time at which the lemma is contradicted. Hence
the write $W(x, Y)$, which occurs before $\tau_2$, complies with the requirements of the lemma. It follows that $Y \sqsupseteq REG^{\tau}[x]$, and we
consequently have $REG^{\tau_1}[x] \sqsupseteq REG^{\tau}[x]$.

But it was shown above (see assertion R) that $X \sqsupseteq REG^{\tau_1}[x]$. Hence, due to the transitivity of the relation $\sqsupseteq$ (Lemma 2), we
obtain $X \sqsupseteq REG^{\tau}[x]$, a contradiction that concludes the proof of the lemma. □ (Lemma 5)


Lemma 6 Let $\tau$ and $\tau' > \tau$ be two time instants. If $REG^{\tau'}$ is $\mathcal{H}(Y)$, then there exists $X \in REG^{\tau}$ such that $Y \sqsupseteq X$.


Proof If $REG^{\tau'} = REG^{\tau}$, the lemma holds trivially. So let us assume in the following that $REG^{\tau'} \neq REG^{\tau}$, which means that
a write happens between $\tau$ and $\tau'$. If $\langle 0, down, false, \bot \rangle \in REG^{\tau}$, as every quadruplet Y written in REG is such that $Y.rd \geq 1$
(line 04, line 05, or lines 06-08), we have $Y \sqsupset \langle 0, down, false, \bot \rangle$.

So, let us assume that $\langle 0, down, false, \bot \rangle \notin REG^{\tau}$ and consider the last write in REG before $\tau$. Assume this happens at
$\tau^{-} < \tau$ and let $p_i$ be the writing process. Process $p_i$ has no write covering a register of REG at time $\tau^{-}$. Consequently, at most
$(n - 1)$ processes (see footnote 5) have a write covering a register of REG at time $\tau^{-}$. Hence, there exists $x \in \{1, \ldots, n\}$ such that no write is
covering $REG[x]$ at time $\tau^{-}$. Let $X = REG^{\tau^{-}}[x] = REG^{\tau}[x]$. If $X = Y$ then the claim of the lemma follows trivially. So
assume in the following that $X \neq Y$. Since $REG^{\tau}[x] = X$, $REG^{\tau'}[x] = Y$ and $Y \neq X$, there is necessarily a write $W(x, Y)$ that
occurred between $\tau^{-}$ and $\tau'$. As this write was not covering a register of REG at time $\tau^{-}$, it follows (according to Lemma 5) that
$Y \sqsupseteq X$, which proves the lemma. □ (Lemma 6)

5 Let us notice that this is the only place in the proof where the consensus version of the algorithm requires more than (n - 1) MWMR atomic registers.

The following two lemmata are corollaries of Lemma 6.

Lemma 7 If $REG^{\tau}$ is $\mathcal{H}(X)$, $REG^{\tau'}$ is $\mathcal{H}(Y)$, and $\tau' > \tau$, then $Y \sqsupseteq X$.

Lemma 8 If $REG^{\tau}$ is $\mathcal{H}(X)$, $REG^{\tau'}$ is $\mathcal{H}(Y)$, $\tau' > \tau$, $(Y.rd = X.rd)$ and $(\lnot Y.cfl)$ then $(Y.val = X.val)$.

Proof According to Lemma 7, $Y \sqsupseteq X$. If $Y = X$ then the claim follows immediately. So let us assume $Y \sqsupset X$. As $(Y.rd = X.rd)$
and $(\lnot Y.cfl)$, the definition of $\sqsupset$ implies that $Y.val = X.val$. □ (Lemma 8)


4.4 Exploiting homogeneous snapshots 

Lemma 9 $[(X \in REG^{\tau}) \land (X.lvl = up)] \Rightarrow (\exists\, \tau' < \tau: REG^{\tau'}$ is $\mathcal{H}(Z)$, where $Z = \langle X.rd - 1, down, false, X.val \rangle)$.

Proof Let us first show that there is a process that writes the quadruplet X' into REG, with $X' = \langle X.rd, X.lvl, false, X.val \rangle$.
We have two cases depending on the value of $X.cfl$.

• If $X.cfl = false$, then let $X' = X$. Since $X.lvl = X'.lvl = up$, X was necessarily written into REG by some process
(let us remember that the initial value of each register of REG is $\langle 0, down, false, \bot \rangle$).

• If $X.cfl = true$, let us consider the time $\tau_1$ at which X was written for the first time into REG, say by $p_i$. Since $X.lvl = up$,
both $\tau_1$ and $p_i$ are well defined. This write of X happens necessarily at line 08 (if it was at line 04 or 05 we would have
$X.cfl = false$).

Therefore, X was computed at line 06 by the function sup(). Namely we have $X = sup(T)$, where the set T is equal to
$\{view^{\tau_1}[1], \ldots, view^{\tau_1}[n], \langle 1, down, false, v_i \rangle\}$. Observe that $X \notin T$, otherwise X would not be written for the first time at
$\tau_1$. Let $X' = max(T)$. Since $X \notin T$, it follows that $X \neq X'$. Due to line S6 of the function sup(), X and X' differ only in
their conflict field. Therefore, as $X.cfl = true$, it follows that $X'.cfl = false$. Finally, as $X'.lvl = up$ and all registers of
REG are initialized to $\langle 0, down, false, \bot \rangle$, it follows that X' was necessarily written into REG by some process.

In both cases, there exists a time at which a process writes $X' = \langle X.rd, X.lvl, false, X.val \rangle$ into REG. Let us consider the
first process $p_i$ that does so. This occurs at some time $\tau_2 < \tau$. As $X'.lvl = up$, this write can occur only at line 04 or line 08.

We show first that this write occurs necessarily at line 04. Assume for contradiction that the write of X' into REG happens
at line 08. In this case, the quadruplet X' was computed at line 06. Therefore, $X' = sup(T)$ where the set T is equal
to $\{view^{\tau_2}[1], \ldots, view^{\tau_2}[n], \langle 1, down, false, v_i \rangle\}$. Observe that $sup(T)$ and $max(T)$ can differ only in their conflict field. As
$sup(T).cfl = X'.cfl = false$, it follows that $X' = sup(T) = max(T)$. Consequently, $X' \in view^{\tau_2}$. That is, $p_i$ is not the first
process that writes X' in REG, a contradiction. Therefore, the write necessarily happens at line 04.

It follows then from the precondition of line 04 that $view^{\tau_2}$ is $\mathcal{H}(\langle X'.rd - 1, down, false, X'.val \rangle)$. Hence, the lemma follows.
□ (Lemma 9)

Lemma 10 $[(REG^{\tau}$ is $\mathcal{H}(X)) \land (X.lvl = up) \land (\lnot X.cfl) \land (REG^{\tau'}$ is $\mathcal{H}(Y)) \land (Y.rd \geq X.rd)] \Rightarrow (Y.val = X.val)$.

Proof The proof is by induction on $Y.rd$. Let us first assume that $Y.rd = X.rd$, for which we consider two cases.

• Case 1: $\tau > \tau'$. Since $X.cfl = false$, it follows according to Lemma 8 that $Y.val = X.val$.

• Case 2: $\tau' > \tau$. According to Lemma 7, $Y \sqsupseteq X$. As $Y.rd = X.rd$, it follows that $Y.lvl \geq X.lvl = up$, and consequently
$Y.lvl = up$.

Summarizing, we have $REG^{\tau'}$ is $\mathcal{H}(Y)$, $Y.lvl = up$ and $Y.rd = X.rd$. According to Lemma 9, this implies that there exist $\tau_1 < \tau$
and $\tau_1' < \tau'$ such that $REG^{\tau_1}$ is $\mathcal{H}(\langle X.rd - 1, down, false, X.val \rangle)$ and $REG^{\tau_1'}$ is $\mathcal{H}(\langle Y.rd - 1, down, false, Y.val \rangle)$.
According to Lemma 7, we have either $\langle X.rd - 1, down, false, X.val \rangle \sqsupseteq \langle Y.rd - 1, down, false, Y.val \rangle$ or
$\langle Y.rd - 1, down, false, Y.val \rangle \sqsupseteq \langle X.rd - 1, down, false, X.val \rangle$. Since by assumption $X.rd = Y.rd$ and neither quadruplet carries a conflict, the definition of $\sqsupset$ yields
$X.val = Y.val$, which establishes the claim.

For the induction step, let us assume that the lemma is true up to $Y.rd = \rho \geq X.rd$, and let us prove it for $\rho + 1$. To this end, we have to
show that $Y.val = X.val$ for every Y that is written in REG with $Y.rd = \rho + 1$. Let us assume by contradiction that $Y.val \neq X.val$
and let $p_i$ be the first process that writes $\langle \rho + 1, -, -, Y.val \rangle$ into REG. This happens at line 04 or 05. In all cases, this implies
that, at this moment, $view_i$ is $\mathcal{H}(\langle \rho, -, -, Y.val \rangle)$. But, according to the induction assumption, this implies $Y.val = X.val$, a
contradiction which completes the proof of the lemma. □ (Lemma 10)


4.5 Proof of the algorithm: exploiting the previous lemmas 

Lemma 11 No two processes decide different values. 


Proof Let r be the smallest round in which a process decides, $p_i$ and val being the deciding process and the decided value,
respectively. Therefore, there is a time $\tau$ at which $view_i$ is $\mathcal{H}(\langle r, up, false, val \rangle)$. Due to Lemma 10, every homogeneous snapshot
starting from round r is necessarily associated with the value val. Therefore, only this value can be decided in any round higher than
r. Since r was assumed to be the smallest round in which a decision occurs, the consensus agreement property follows. □ (Lemma 11)


Lemma 12 For every quadruplet X that is written in REG, X.val is a value proposed by some process. 


Proof Let us assume by contradiction that $X.val = v$ was not proposed by a process, and let $p_i$ be the first process that writes X
into REG. We consider two cases according to the line at which the write occurs.

• v is written into REG at line 04 or line 05. In this case, $p_i$ obtained a view of REG in which at least some register contains
the value v. According to the predicate of these two lines, the round number associated with v is necessarily greater than 0,
which implies that v was previously written into REG and was not there initially. But this means that $p_i$ is not the first process
which writes v into REG, a contradiction.

• v is written into REG at line 08. In this case, the quadruplet X, where $X.val = v$, was returned by the call of the function
sup(), namely $sup(view[1], \ldots, view[n], \langle 1, down, false, v_i \rangle)$, from which it follows that v is either $v_i$ (the proposal of $p_i$)
or some value that was previously written by another process. But, by assumption, $p_i$ is the first process to write
v. Hence, $v = v_i$, which concludes the proof of the lemma. □ (Lemma 12)


Lemma 13 A decided value is a proposed value. 


Proof If a process decides a value v, it does it at line 03. Hence, according to the predicate of line 03, the round number associated
with this value is greater than 0, which means that v was necessarily written into REG by some process. It then follows from
Lemma 12 that v was proposed by a process, which establishes the claim. □ (Lemma 13)


Lemma 14 Let T be a set of quadruplets. For every $T' \subseteq T$: $sup(T' \cup \{sup(T)\}) = sup(T)$.

Proof Let $S = sup(T)$. Hence $S.rd$ is the highest round number in T. Moreover, S is greater than, or equal to, any quadruplet
in T. Hence, $max(T' \cup \{S\}) = S$. Therefore, combined with the definition of sup(), we have:
$sup(T' \cup \{S\}) = \langle S.rd, S.lvl, conflict(T' \cup \{S\}), S.val \rangle$. Thus, in order to prove that $sup(T' \cup \{S\}) = S$, we need to show that
$conflict(T' \cup \{S\}) = S.cfl$. There are two cases depending on the value of $S.cfl$.

• $S.cfl = true$.

In this case, $conflict1(\{S\}) = true$. But $S.rd$ is the highest round number in T, from which it follows that $S.rd$ is also the
highest in $T' \cup \{S\}$. Therefore, $conflict1(\{S\}) = true$ implies that $conflict1(T' \cup \{S\}) = true$.

• $S.cfl = false$.

Since $S = sup(T)$, it follows that $conflict(T) = false$. Consequently, both $conflict1(T)$ and $conflict2(T)$ are false.
Moreover, as $S.cfl = false$, it follows that $conflict1(\{S\}) = false$. Therefore $conflict1(T \cup \{S\}) = false$. But, as
$T' \subseteq T$, this yields $conflict1(T' \cup \{S\}) = false$.

On the other hand, it follows from $conflict2(T) = false$ that $|vals(T)| = 1$. As $S = sup(T)$, we have $S.val \in vals(T)$.
Therefore $|vals(T \cup \{S\})| = 1$. Since $T' \subseteq T$, it follows that $|vals(T' \cup \{S\})| = 1$, which implies $conflict2(T' \cup \{S\}) = false$.

As both $conflict1(T' \cup \{S\})$ and $conflict2(T' \cup \{S\})$ are false, it follows that $conflict(T' \cup \{S\}) = false$.

From the case analysis we conclude that $conflict(T' \cup \{S\}) = S.cfl$. □ (Lemma 14)


Lemma 15 If there is a time after which a process executes solo, it decides a value. 


Proof Assume that $p_i$ eventually runs solo; we need to show that $p_i$ decides. There exists a time $\tau$ after which no process other than
$p_i$ writes into REG. Let $\tau' > \tau$ be the first time at which $p_i$ takes a snapshot after $\tau$. This snapshot is well defined, as $p_i$ runs solo
after $\tau$ and the implementation of atomic snapshot is obstruction-free. Let $S = sup(view_i^{\tau'}[1], \ldots, view_i^{\tau'}[n], \langle 1, down, false, v_i \rangle)$.
Let us first show that there is a time after $\tau$ at which REG is $\mathcal{H}(S)$.

• If $REG^{\tau'}$ is $\mathcal{H}(S)$, we are done.

• If $REG^{\tau'}$ is not $\mathcal{H}(S)$, $p_i$ executes line 06 and computes S. Then it writes S in an entry of REG (containing a value different
from S), and re-enters the loop. If REG is then $\mathcal{H}(S)$, we are done. Otherwise, $p_i$ executes again line 06 and, due to
Lemma 14, the quadruplet computed by the function sup() is equal to S. It follows that after a finite number of iterations of
the loop, REG is $\mathcal{H}(S)$.

When REG is $\mathcal{H}(S)$, we have the following.

• If $S = \langle -, up, false, - \rangle$, $p_i$ decides in line 03.

• If $S = \langle r, down, false, val \rangle$, then $p_i$ writes $Y = \langle r + 1, up, false, val \rangle$ in line 04. Using the same argument as above, there
is a time at which REG becomes $\mathcal{H}(Y)$, and the previous case holds.

• If $S = \langle r, -, true, val \rangle$, then $p_i$ writes $Y = \langle r + 1, down, false, val \rangle$ in line 05. Then $p_i$ keeps writing Y in the following
iterations until REG becomes $\mathcal{H}(Y)$, and the previous case holds.

Hence, in all cases $p_i$ eventually decides. □ (Lemma 15)
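The solo-termination argument can be replayed on a toy sequential model. The sketch below is a reconstruction, not the algorithm of Figure 2 itself: the loop body is inferred from the line numbers cited in the proofs (03 decide, 04 and 05 start a new round, 06-08 write sup() into the first differing entry), the snapshot is a plain list copy (valid only because a single process runs), and the encodings `DOWN`, `UP` and `BOTTOM` are assumptions made here.

```python
DOWN, UP = 0, 1   # assumed encoding of levels, so that down < up
BOTTOM = -1       # assumed stand-in for the initial value; proposals are non-negative ints


def sup(quads):
    """sup() as used in the proofs (reconstruction, see the lead-in)."""
    top = max(quads)
    r = top[0]
    conflict1 = any(q[0] == r and q[2] for q in quads)
    conflict2 = len({q[3] for q in quads if q[0] == r}) > 1
    return (r, top[1], conflict1 or conflict2, top[3])


def propose_solo(reg, v, max_steps=10_000):
    """Run one process alone on `reg`; return (decided value, number of writes)."""
    writes = 0
    for _ in range(max_steps):
        view = list(reg)                         # line 02: snapshot (atomic, since solo)
        first = view[0]
        if all(q == first for q in view) and first[0] > 0:
            r, lvl, cfl, val = first
            if cfl:                              # line 05: conflict, open round r + 1
                reg[0] = (r + 1, DOWN, False, val)
            elif lvl == UP:                      # line 03: homogeneous and up, decide
                return val, writes
            else:                                # line 04: promote the value to level up
                reg[0] = (r + 1, UP, False, val)
        else:                                    # lines 06-08
            q = sup(view + [(1, DOWN, False, v)])
            x = next(i for i, e in enumerate(view) if e != q)
            reg[x] = q
        writes += 1
    raise RuntimeError("no decision within max_steps")
```

Starting from the initial memory of three registers, `propose_solo([(0, DOWN, False, BOTTOM)] * 3, 5)` fills REG with the round-1 quadruplet, then with the round-2 quadruplet at level up, and decides 5. Starting from a memory left in a conflicting state by other processes, the process first homogenizes REG on the supremum (Lemma 14 guarantees that this supremum is stable), then opens a fresh round and decides.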


Lemma 16 If a single value is proposed, all correct processes decide. 

Proof Let us assume that all processes propose the same value v. It follows that all the processes keep writing $X = \langle 1, down, false, v \rangle$
until REG becomes $\mathcal{H}(X)$. Then, once every register of REG has been updated at least once, the processes start writing
$Y = \langle 2, up, false, v \rangle$ until REG becomes $\mathcal{H}(Y)$. When this occurs, v is decided. □ (Lemma 16)


Theorem 1 The algorithm described in Figure 2 solves the obstruction-free consensus problem (as defined in Section 2.2).

Proof The proof follows directly from Lemma 11 (Agreement), Lemma 13 (Validity), Lemma 15 (OB-Termination), and
Lemma 16 (SV-Termination). □ (Theorem 1)


5 From Consensus to (n, k)-Set Agreement

The algorithm The obstruction-free (n, k)-set agreement algorithm is the same as the one of Figure 2 except that now there are
only $m = n - k + 1$ MWMR atomic registers instead of $m = n$. Hence REG is now $REG[1..(n - k + 1)]$.


Its correctness The arguments for the validity and liveness properties are the same as the ones of the consensus algorithm since
they do not depend on the size of the memory REG.

As far as the k-set agreement property is concerned (no more than k different values can be decided), we have to show that
$(n - k + 1)$ registers are sufficient. To this end, let us consider the $(k - 1)$ first decided values, where the notion "first" is defined with
respect to the linearization time of the snapshot invocation (line 02) that immediately precedes the invocation of the corresponding
deciding statement (return() at line 03). Let $\tau$ be the time just after the linearization of these $(k - 1)$ "deciding" snapshots. Starting
from $\tau$, at most $(n - (k - 1)) = (n - k + 1)$ processes access the array REG, which is made up of exactly $(n - k + 1)$ registers. Hence,
after $\tau$, these $(n - k + 1)$ processes execute the consensus algorithm of Figure 2 where $(n - k + 1)$ replaces n, and consequently
at most one new value is decided. Therefore, at most k values are decided by the n processes. 
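As a purely illustrative instance of this counting argument (the values n = 5 and k = 3 are chosen here and do not come from the paper), the quantities involved are:

```latex
\begin{align*}
m &= n - k + 1 = 5 - 3 + 1 = 3 && \text{registers in } REG,\\
k - 1 &= 2 && \text{values decided by the first deciding snapshots (before } \tau\text{)},\\
n - (k - 1) &= 5 - 2 = 3 = m && \text{processes that may still access } REG \text{ after } \tau,\\
(k - 1) + 1 &= 3 = k && \text{upper bound on the number of decided values.}
\end{align*}
```

The third line is the point of the argument: after $\tau$ the number of remaining processes equals the number of registers, which is the setting in which the consensus proof of Section 4 applies.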
6 From One-shot to Repeated (n, k)-Set Agreement
6.1 The repeated (n, k)-set agreement problem

In the repeated (n, k)-set agreement problem, the processes execute a sequence of (n, k)-set agreement instances. Hence, a process
$p_i$ invokes sequentially the operation propose(1, $v_i$), then propose(2, $v_i$), etc., where $sn_i = 1, 2, \ldots$ is the sequence number of its
current instance, and $v_i$ the value it proposes to this instance.

It would be possible to associate a specific instance of the base algorithm described in Figure 2 with each sequence number, but
this would require $(n - k + 1)$ atomic read/write registers per instance. The next section shows that it is possible to solve the repeated
problem with only $(n - k + 1)$ atomic registers. According to the complexity results of [9], it follows that this algorithm is optimal
in the number of atomic registers, which consequently closes the lower/upper bounds discussion associated with repeated (n, k)-set
agreement.


6.2 Adapting the algorithm 

From quadruplets to sixuplets Instead of a quadruplet, an atomic read/write register is now a sixuplet $X = \langle sn, rd, lvl, cfl, val, dcd \rangle$.
The four fields $X.rd$, $X.lvl$, $X.cfl$, $X.val$ are the same as before. The new field $X.sn$ contains a sequence number, while the new
field $X.dcd$ is an initially empty list. From a notational point of view, the jth element of this list is denoted $X.dcd[j]$; it contains a
value decided by the jth instance of the repeated (n, k)-set agreement.

The total order ">" on sixuplets is the classical lexicographical order defined on its first five fields, while the relation $\sqsupset$ is now
defined as follows:

$X \sqsupset Y \;\stackrel{def}{=}\; (X > Y) \land [(X.sn > Y.sn) \lor (X.rd > Y.rd) \lor (X.cfl) \lor (\lnot Y.cfl \land X.val = Y.val)]$.
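The order and the relation on sixuplets translate directly into code. The following sketch is an illustration under assumptions made here: the class name `Sixuplet`, the integer encoding of levels (`DOWN = 0 < UP = 1`), integer proposal values, and a tuple for the `dcd` list are all hypothetical choices, not notation from the paper.

```python
from typing import NamedTuple, Tuple

DOWN, UP = 0, 1  # assumed encoding of the level field, so that down < up


class Sixuplet(NamedTuple):
    sn: int                     # sequence number of the instance
    rd: int                     # round number
    lvl: int                    # DOWN or UP
    cfl: bool                   # conflict bit
    val: int                    # carried value
    dcd: Tuple[int, ...] = ()   # decided values of earlier instances (ignored by the order)


def greater(x: Sixuplet, y: Sixuplet) -> bool:
    """Total order '>': lexicographic on the first five fields only."""
    return x[:5] > y[:5]


def strictly_dominates(x: Sixuplet, y: Sixuplet) -> bool:
    """The relation on sixuplets, following the definition term by term."""
    return greater(x, y) and (
        x.sn > y.sn or x.rd > y.rd or x.cfl or (not y.cfl and x.val == y.val)
    )
```

For instance, a sixuplet of instance 2 dominates any sixuplet of instance 1 whatever their rounds, while two conflict-free sixuplets of the same instance and round that carry different values are ordered by ">" but not related by the relation.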


Local variables Each process $p_i$ now has to manage two local variables whose scope is the whole repeated (n, k)-set agreement
problem.

• The variable $sn_i$, initialized to 0, is used by $p_i$ to generate its sequence numbers. It is assumed that $p_i$ increases $sn_i$ before
invoking propose($sn_i$, $v_i$).

• The local list $dcd_i$ is used by $p_i$ to store the values it has decided during the previous instances of the (n, k)-set agreement.
Hence, $dcd_i[j]$ contains the value decided by $p_i$ during the jth instance.


The algorithm The algorithm executed by a process $p_i$ is described in Figure 3. The parts which are new with respect to the base
algorithm of Figure 2 are in red.


operation propose(sn_i, v_i) is
(01) repeat forever
(02)    view ← REG.snapshot();
(03)    case (∀x : view[x] = ⟨sn_i, r, up, false, val, -⟩ where r > 0) then dcd_i[sn_i] ← val; return(val)
(04)         (∀x : view[x] = ⟨sn_i, r, down, false, val, -⟩ where r > 0) then REG.write(1, ⟨sn_i, r + 1, up, false, val, dcd_i⟩)
(05)         (∀x : view[x] = ⟨sn_i, r, level, true, val, -⟩ where r > 0) then REG.write(1, ⟨sn_i, r + 1, down, false, val, dcd_i⟩)
(06)         otherwise let ⟨inst, r, level, conflict, val, dec⟩ ← sup(view[1], ..., view[n], ⟨sn_i, 1, down, false, v_i, dcd_i⟩);
(07)              if (inst > sn_i) then dcd_i[sn_i] ← dec[sn_i]; return(dcd_i[sn_i]) end if;
(08)              x ← smallest index such that view[x] = min(view[1], ..., view[n]);
(09)              REG.write(x, ⟨inst, r, level, conflict, val, dec⟩)
(10)    end case
(11) end repeat.


Figure 3: Repeated obstruction-free Consensus


Line 03 When all entries of a view obtained by $p_i$ contain only sixuplets whose first five fields are equal, $p_i$ decides the
value val. But before returning val, $p_i$ writes it in $dcd_i[sn_i]$. Hence, when $p_i$ executes the next (n, k)-set agreement
instance (whose sequence number will be $sn_i + 1$), it will be able to help processes whose current sequence number sn' is
smaller than $sn_i$ decide a value returned by the instance sn' of the repeated (n, k)-set agreement.

Line 04 In this case, $p_i$ obtains a view in which the first five fields of all entries are equal to $\langle sn_i, r, down, false, val \rangle$. It then writes in REG[1]
the sixuplet $\langle sn_i, r + 1, up, false, val, dcd_i \rangle$. Let us notice that the write of $dcd_i$ is to help other processes decide in (n, k)-set
agreement instances whose sequence number is smaller than $sn_i$.

Line 05 This case is similar to the previous one.

Lines 06-09 In this case, $p_i$ computes the supremum of the snapshot value view obtained at line 02 plus the sixuplet
$\langle sn_i, 1, down, false, v_i, dcd_i \rangle$. There are two cases.

- If the sequence number inst of this supremum is greater than $sn_i$ (line 07), $p_i$ can benefit from the list of values already
decided in (n, k)-set agreement instances whose sequence number is smaller than inst. This help is obtained from
$dec[sn_i]$. Consequently, similarly to line 03, $p_i$ writes this value in $dcd_i[sn_i]$ and decides it.

- If $inst = sn_i$, $p_i$ executes as in the base algorithm (lines 08-09).


Hence, solving repeated (n, k)-set agreement in an anonymous system does not require more atomic read/write registers than the base
non-repeated version. The only additional cost lies in the size of the atomic registers, which contain two supplementary unbounded
fields. As already indicated, it follows from the lower bound established in [9] that this algorithm is optimal with respect to the
number of underlying atomic registers.


7 From Obstruction-Freedom to x-Obstruction-Freedom 

This section extends the base algorithm to obtain an algorithm that solves the x-obstruction-free (n, k)-set agreement problem. Let
$1 \leq x \leq k < n$ (see footnote 6).

One-shot x-obstruction-freedom This progress condition, introduced in [30, 31], is a natural generalization of obstruction-freedom,
which corresponds to the case x = 1.

x-Obstruction-freedom guarantees that, for every set of processes P with $|P| \leq x$, every correct process in P returns from its
operation invocation if no process outside P takes steps for "long enough". It is easy to see that x-obstruction-freedom and
wait-freedom are equivalent in any n-process system where $x \geq n$. Differently, when $x < n$, x-obstruction-freedom depends on the
concurrency pattern while wait-freedom does not.


x-Obstruction-free (n, k)-set agreement: OB-Termination When considering x-obstruction-freedom, the Validity, Agreement
and SV-Termination properties defining obstruction-free (n, k)-set agreement are the same as the ones stated in Section 2.2. The
only property that must be adapted is OB-Termination, which becomes:


• x-OB-termination. If there is a time after which at most x correct processes execute concurrently, each of these processes
eventually decides a value.


The shared memory REG To cope with the x-concurrency allowed by x-obstruction-freedom, the array REG now has
$m = n - k + x$ entries (i.e., $m = n - k + 1$ entries for the base obstruction-freedom). This increase in the size of the array is
due to the fact that the algorithm is required to terminate in more scenarios than simple obstruction-freedom.


Content of a quadruplet In the base algorithm, the four fields of a quadruplet X are a round number $X.rd$, a level $X.lvl$, a
conflict value $X.cfl$, and a value $X.val$. Coping with x-concurrency requires replacing the last field, which was made up of a single
value $X.val$, by a set of values denoted $X.valset$.


function sup(T) is % T is a set of quadruplets, the last field of each of them is now a set of values %
(S1') let ⟨r, level, cfl, valset⟩ be max(T); % lexicographical order %
(S2') let vals(T) be {v | ⟨r, -, -, valset⟩ ∈ T ∧ v ∈ valset};
(S3)  let conflict1(T) be ∃ ⟨r, -, true, -⟩ ∈ T; % conflict inherited %
(S4') let conflict2(T) be |vals(T)| > x; % conflict discovered %
(S5)  let conflict(T) be conflict1(T) ∨ conflict2(T);
(N)   if conflict(T) then vals'(T) ← valset else vals'(T) ← the set of the (at most) x greatest values in vals(T) end if;
(S6') return(⟨r, level, conflict(T), vals'(T)⟩).


Figure 4: Function sup() suited to x-obstruction-freedom


6 This assumption is a necessary requirement to solve (n, k)-set agreement in a read/write system. It follows from the impossibility result stating that (n, k)-set
agreement cannot be wait-free solved for n > k, when any number of processes may crash [4, 18, 27].

The modified function sup() Coping with x-concurrency also requires adapting the function sup(). This function sup() is a simple
extension of the base version described in Figure 1 that considers a set of values instead of a single value. It is described in
Figure 4. The lines that are modified (with respect to the base function sup()) are followed by a "prime", and a new line (marked N)
is added. More precisely, the modifications are the following.

• Line S1'. The last field of a quadruplet is now a set of values, denoted valset. As far as the lexicographical ordering is
concerned, the sets valset are ordered as follows. They are ordered by size, and sets of the same size are ordered from their
greatest to their smallest element.

• Line S2'. The set vals(T) is now the union of all the valset associated with the greatest round number appearing in T.

• Lines S3 and S5: not modified.

• Line S4'. conflict2(T) is modified to take into account x-concurrency. A conflict is now discovered when more than x
(instead of 1) values are associated with the round number of the maximal element of T.

• New line N. The set vals'(T) is equal to valset if conflict(T) = true. Otherwise, it contains the (at most) x greatest values
of vals(T).

• Line S6'. The quadruplet returned by sup(T) differs from the one of Figure 1 in its last field, which is now the set vals'(T).

It is easy to see that, when the last field of the quadruplets is reduced to a singleton, and x = 1, this extended version boils down
to the one described in Figure 1.
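The extended function can be sketched in executable form as follows. This is an illustration only: the names `Quad`, `valset_key` and `sup_x` are hypothetical, levels are encoded as `DOWN = 0 < UP = 1`, values are integers, and the phrase "ordered from their greatest to their smallest element" is interpreted here as a lexicographic comparison of the elements sorted in decreasing order.

```python
from typing import FrozenSet, Iterable, NamedTuple

DOWN, UP = 0, 1  # assumed encoding of the level field, so that down < up


class Quad(NamedTuple):
    rd: int
    lvl: int
    cfl: bool
    valset: FrozenSet[int]


def valset_key(vs):
    """Order value sets by size, then element-wise from greatest to smallest (assumed reading)."""
    return (len(vs), tuple(sorted(vs, reverse=True)))


def quad_key(q: Quad):
    """Key realizing the lexicographical order on quadruplets."""
    return (q.rd, q.lvl, q.cfl, valset_key(q.valset))


def sup_x(quads: Iterable[Quad], x: int) -> Quad:
    quads = list(quads)
    top = max(quads, key=quad_key)                                   # S1'
    r = top.rd
    vals = set().union(*(q.valset for q in quads if q.rd == r))      # S2'
    conflict1 = any(q.rd == r and q.cfl for q in quads)              # S3: conflict inherited
    conflict2 = len(vals) > x                                        # S4': conflict discovered
    conflict = conflict1 or conflict2                                # S5
    if conflict:                                                     # N
        new_vals = top.valset
    else:
        new_vals = frozenset(sorted(vals, reverse=True)[:x])
    return Quad(r, top.lvl, conflict, new_vals)                      # S6'
```

With x = 2, two round-1 quadruplets carrying {3} and {5} merge without conflict into a quadruplet carrying {3, 5}; adding a third one carrying {7} exceeds x, so the conflict bit is raised and the value set of the maximal quadruplet is kept.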


operation propose(v_i) is
(01) Q ← ⟨1, down, false, {v_i}⟩;
(02) repeat forever
(03)    view ← REG.snapshot();
(04)    case (∀x : view[x] = Q = ⟨r, up, false, valset⟩ where r > 0) then return any value in valset
(05)         (∀x : view[x] = Q = ⟨r, down, false, valset⟩ where r > 0) then Q ← ⟨r + 1, up, false, valset⟩; REG.write(1, Q)
(06)         (∀x : view[x] = Q = ⟨r, level, true, valset⟩ where r > 0) then let v be any value in valset;
(07)              Q ← ⟨r + 1, down, false, {v}⟩; REG.write(1, Q);
(08)         otherwise let Q ← sup(view[1], ..., view[n], Q);
(09)              x ← smallest index such that view[x] ≠ Q;
(10)              REG.write(x, Q)
(11)    end case
(12) end repeat.


Figure 5: Anonymous x-obstruction-free Consensus


x-Obstruction-free (n, k)-set agreement: algorithm An algorithm extending the base obstruction-free algorithm of Figure 2 to
an x-obstruction-free (n, k)-set agreement algorithm is described in Figure 5. (Let us remember that, as the underlying snapshot
algorithm is non-blocking [15], it ensures that, whatever the concurrency pattern, at least one snapshot invocation always terminates.)
This algorithm solving the x-obstruction-free (n, k)-set agreement problem is obtained as follows, where (as already indicated) the
array REG is composed of $m = n - k + x$ atomic read/write registers.

• The relation $\sqsupset$ introduced in Section 4.1 is extended to take into account the fact that the last field of a quadruplet is now a
non-empty set of values. It becomes:

$X \sqsupset Y \;\stackrel{def}{=}\; (X > Y) \land [(X.rd > Y.rd) \lor (X.cfl) \lor (\lnot Y.cfl \land X.valset \supseteq Y.valset)]$.

• Each process $p_i$ maintains a local quadruplet denoted Q, containing the last quadruplet it has computed. Initially, Q is equal
to $\langle 1, down, false, \{v_i\} \rangle$ (line 01; see footnote 7).

This quadruplet allows its owner $p_i$ to have an order on all the quadruplets it champions during the execution of propose($v_i$).
Hence, if $p_i$ champions Q at time $\tau$, and champions Q' at time $\tau' > \tau$, we have $Q' \sqsupseteq Q$. This is to ensure the x-OB-termination
property.

• The meaning of the three predicates at lines 04-06 is the following. All entries of view are the same and are equal to Q, where
the content of Q is either $\langle r, up, false, valset \rangle$, or $\langle r, down, false, valset \rangle$, or $\langle r, level, true, valset \rangle$. Hence, according
to the terminology of the proof of the base algorithm, introduced in Section 4.1, view is homogeneous, i.e., view is $\mathcal{H}(Q)$
where Q obeys some predefined pattern.


7 Let us notice that the algorithm no longer has the memoryless property of the base algorithm.

• Lemma 10 needs to be re-formulated to take into account the set field of each quadruplet. It becomes:

$[(REG^{\tau}$ is $\mathcal{H}(X)) \land (X.lvl = up) \land (\lnot X.cfl) \land (REG^{\tau'}$ is $\mathcal{H}(Y)) \land (Y.rd \geq X.rd)] \Rightarrow (Y.valset \supseteq X.valset \lor X.valset \supseteq Y.valset)$.

The lemma is true if the number of participating processes does not exceed the number of available registers in REG.

As far as the k-set agreement property is concerned (no more than k different values can be decided), we have to show that $(n - k + x)$ registers
are sufficient. The reasoning is similar to the one done at the end of Section 5. More precisely, let us consider the $(k - x)$ first
decided values, where the notion "first" is defined with respect to the linearization time of the snapshot invocation (line 02)
that immediately precedes the invocation of the corresponding deciding statement (return() at line 04). Let $\tau$ be the time just
after the linearization of these $(k - x)$ "deciding" snapshots. Starting from $\tau$, at most $(n - (k - x)) = (n - k + x)$ processes
access the array REG, which is made up of exactly $(n - k + x)$ registers. Consider the $(k - x + 1)$-th deciding snapshot, let
it be at $\tau' > \tau$. According to the precondition of line 03, $REG^{\tau'}$ is $\mathcal{H}(X)$ for some X with $X.lvl = up$ and $X.cfl = false$.
Observe that $|X.valset| \leq x$.

According to the new statement of Lemma 10, since starting from $\tau$ the number of participating processes never exceeds
the number of registers, all deciding snapshots after $\tau'$ are associated with a set of values that is either a subset or a
superset of $X.valset$. Hence, at most x values can be decided starting from $\tau'$.

As far as x-OB-termination is concerned, the key is line 07. When a process detects a conflict ($Q.cfl = true$, at line 06),
it starts a new round with a set which is a singleton. Hence, if there is a finite time after which no more than x processes are
executing, there is a finite round from which at most x values survive and appear in the next round. From that round, no new
conflict can be discovered, and eventually the (at most) x running processes obtain snapshots entailing decision.


8 Conclusion 

This paper presented first a base a one-shot obstruction-free (n, A:)-set agreement algorithm for a system made up of n asynchronous 
and anonymous processes, which communicate through atomic read/write registers. This algorithm requires only (n — k + 1) 
such registers. From this cost point of view, it is the best algorithm known so far (the best previously known algorithm requires 
2 (n — k) - 1 atomic read/write registers). Hence, this algorithm answers the challenge posed in 0, and establishes a new upper 
bound of (n — k + 1) on the number of registers to solve the one-shot obstruction-free (n, k)- set agreement problem. This upper 
bound improves the ones stated in 0 for anonymous and non-anonymous systems. 

A simple extension of the previous algorithm has then been presented, that solves the repeated (n, A')-set agreement problem. 
While the lower bound of (n — k + 1) atomic registers was established in 0 for this problem, the proposed algorithm shows that the 
upper bound is also equal (n — k + 1), and consequently the proposed algorithm is optimal. The paper has also generalized the base 
one-shot algorithm to solve the ( n , A: J-set agreement problem in the context of x- obs true tion- fre edom. The corresponding algorithm 
reduces to (n — k + x) the upper bound on the number of atomic read/write registers. 

To attain these goals, the algorithms, which have been presented in an incremental way, rely on a simple round-based structure.
Moreover, the base one-shot algorithm does not require persistent local variables, and, in addition to a proposed value, an atomic
register contains only two bits and a round number. The algorithm solving the repeated (n, k)-set agreement problem requires that
each atomic register includes two more integers.

Let us call the “MWMR-nA” of a problem P the minimal number of MWMR atomic registers needed to solve P in an asynchronous
system of n processes. The paper has shown that (n - k + 1) is the MWMR-nA of repeated obstruction-free (n, k)-set agreement.
We conjecture that (n - k + 1) is also the MWMR-nA of one-shot obstruction-free (n, k)-set agreement, and more generally that
(n - k + x) is the MWMR-nA of one-shot x-obstruction-free (n, k)-set agreement, when 1 ≤ x ≤ k < n.


A Non-blocking snapshot object

This appendix presents a non-blocking (hence obstruction-free) snapshot object which uses no additional atomic register. The idea
that underlies this algorithm, which is due to Guerraoui and Ruppert [15], is simple. The algorithm, described in Figure 6, considers
that the n anonymous processes share m underlying MWMR atomic registers.


Shared variables

    SM[1..m]: array of m multivalued MWMR atomic registers, initially [(-, ⊥), ..., (-, ⊥)];

    SM[x] = (SM[x].ts, SM[x].value); only SM[x].value can be made visible outside.

Permanent local variable: each process p_i manages a counter ts_i, initialized to 0.

operation write(x, v) is    % issued by p_i %

(01)  SM[x] ← (ts_i, v); ts_i ← ts_i + 1; return().

operation snapshot() is

(02)  count ← 1; for each x ∈ {1, ..., m} do sm1[x] ← SM[x] end for;
(03)  repeat forever
(04)     for each y ∈ {1, ..., m} do sm2[y] ← SM[y] end for;
(05)     if (∀ x ∈ {1, ..., m} : sm1[x] = sm2[x])
(06)        then count ← count + 1;
(07)             if (count = m(n - 1) + 2) then return(sm1[1..m].value) end if
(08)        else count ← 1
(09)     end if;
(10)     sm1[1..m] ← sm2[1..m]
(11)  end repeat.


Figure 6: Obstruction-free snapshot object [15]

Each process p_i manages an integer local variable ts_i, which it uses to associate a sequence number with its successive write operations
into any atomic register SM[x] (line 01).

When a process invokes snapshot(), it repeatedly reads the array SM[1..m] until it obtains an array value sm[1..m] that does
not change during (m(n - 1) + 2) readings of SM[1..m]. When this occurs, the invoking process returns the corresponding array
value sm[1..m].
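The behavior just described can be sketched in Python as a sequential simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names `SharedMemory` and `Process`, the `collects` counter, and the `between_collects` hook (which lets a test inject writes between two readings of the array) are ours, registers are indexed from 0 rather than 1, and each list access stands for one atomic register operation.

```python
from typing import Any, Callable, List, Optional, Tuple

BOTTOM = None  # stands for the default value ⊥

Pair = Tuple[Optional[int], Any]  # (ts, value); ts is None in the initial state


class SharedMemory:
    """m MWMR registers shared by n anonymous processes (sequential simulation)."""

    def __init__(self, m: int, n: int) -> None:
        self.m, self.n = m, n
        self.SM: List[Pair] = [(None, BOTTOM)] * m
        self.collects = 0  # instrumentation only, not part of Figure 6

    def collect(self) -> List[Pair]:
        """Read SM[0..m-1], one register at a time."""
        self.collects += 1
        return [self.SM[x] for x in range(self.m)]


class Process:
    def __init__(self, mem: SharedMemory) -> None:
        self.mem = mem
        self.ts = 0  # permanent local counter ts_i

    def write(self, x: int, v: Any) -> None:
        # Line 01: tag the value with the local sequence number.
        self.mem.SM[x] = (self.ts, v)
        self.ts += 1

    def snapshot(
        self, between_collects: Optional[Callable[[], None]] = None
    ) -> List[Any]:
        mem = self.mem
        needed = mem.m * (mem.n - 1) + 2
        count = 1                      # line 02
        sm1 = mem.collect()
        while True:                    # line 03
            if between_collects is not None:
                between_collects()     # hypothetical hook, absent from Figure 6
            sm2 = mem.collect()        # line 04
            if sm1 == sm2:             # line 05
                count += 1             # line 06
                if count == needed:    # line 07
                    return [value for _, value in sm1]
            else:
                count = 1              # line 08
            sm1 = sm2                  # line 10


mem = SharedMemory(m=2, n=3)
p = Process(mem)
p.write(0, "a")
view = p.snapshot()  # running alone: m(n - 1) + 2 = 6 readings of the array
```

When the process runs alone, the snapshot returns after exactly m(n - 1) + 2 readings of the array. A write injected through the hook resets the counter, so the invocation needs additional readings before it returns.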

Trivially, any write operation terminates. As far as the snapshot operation is concerned, it is easy to see that, if there is a time after
which a process executes alone, it terminates its snapshot operation; hence the implementation is obstruction-free.

To show that it is non-blocking, let us assume that a process repeatedly invokes REG[x].write() (whatever x) followed by
REG.snapshot() (as is the case in the algorithms presented in the paper). An invocation of REG.snapshot() can be prevented
from terminating only if processes permanently issue invocations of write(). Let us assume that no invocation of REG.snapshot()
terminates. This means that there are processes that permanently issue write operations. But this contradicts the assumption that
each process alternates invocations of REG[x].write() (whatever x) and REG.snapshot(). This is because, between two writes
issued by the same process, this process invoked REG.snapshot(), and consequently this snapshot invocation terminated.

As far as the linearization of the operations write() and snapshot() invoked by the processes is concerned, we have the following
(this proof is from [15]). Let us consider an invocation of snapshot() that terminates. It has seen m(n - 1) + 2 times the same vector
sm[1..m] in the array SM[1..m]. Since a given pair (ts, v) can be written at most once by a process, it can be written at most (n - 1)
times during a snapshot (once by each process, except the one invoking the snapshot). It follows that, among the m(n - 1) + 2 times
where the same vector sm[1..m] was read from SM[1..m], there are at least two consecutive reads during which no process wrote a
register. The snapshot invocation is consequently linearized after the first of these two reads.
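The counting behind the threshold m(n - 1) + 2 can be laid out as follows. This is only a restatement of the argument above; the word "interval" (the span between two consecutive reads of the array) is our terminology, and the block assumes the amsmath package.

```latex
\begin{align*}
\#\{\text{reads of } SM[1..m] \text{ returning the same vector}\} &= m(n-1) + 2, \\
\#\{\text{intervals between consecutive such reads}\} &= m(n-1) + 1, \\
\#\{\text{writes of a pair already present in } sm[1..m]\} &\le m \cdot (n-1), \\
\#\{\text{intervals containing no such write}\} &\ge \bigl(m(n-1) + 1\bigr) - m(n-1) = 1 .
\end{align*}
```

The third line combines the m registers with the (n - 1) processes other than the invoker, each of which writes a given pair at most once.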


B All Correct Processes Decide if One Process Decides 

This appendix shows that, by adding one MWMR register, the consensus termination property can be strengthened. More precisely, 
we then have the following additional termination property (where OA stands for “One-All”).

• OA-termination. If a process decides, all correct processes decide. 

Let DEC be the additional register, initialized to the default value ⊥. The extended algorithm is the one described in Figure 2
with only two modifications. 

• The first modification is the addition of the new line

    if (DEC ≠ ⊥) then return(val) end if

between line 01 and line 02. Each time it enters the repeat loop, a process first checks if a value was previously decided. If it
is the case, it decides it.


• The second modification is the addition, at line 04, of the statement “DEC ← val” just before the return statement.
When a process is about to decide, it first writes the decided value in the MWMR atomic register DEC.
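The two modifications can be sketched as a wrapper around the loop body of the base algorithm. This is a hedged illustration only: `Shared`, `propose` and `try_round` are hypothetical names, `try_round` stands in for the part of the repeat loop of Figure 2 that is not reproduced here (it is assumed to return a value when the process is about to decide, and ⊥ otherwise), and returning the value read from DEC is our reading of "it decides it".

```python
from typing import Any, Callable

BOTTOM = object()  # sentinel standing for the default value ⊥


class Shared:
    """Holds the additional MWMR register DEC."""

    def __init__(self) -> None:
        self.DEC: Any = BOTTOM


def propose(shared: Shared, try_round: Callable[[], Any]) -> Any:
    while True:                      # repeat loop of the base algorithm
        decided = shared.DEC         # first modification: check DEC on loop entry
        if decided is not BOTTOM:
            return decided           # a value was already decided: adopt it
        val = try_round()            # hypothetical placeholder for the loop body
        if val is not BOTTOM:
            shared.DEC = val         # second modification: publish before returning
            return val


shared = Shared()
first = propose(shared, lambda: 7)   # no prior decision: decides its own outcome
second = propose(shared, lambda: 9)  # finds DEC already set and adopts it
```

In this run the second invocation never executes its own loop body, which is the effect the OA-termination property relies on.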


Theorem 2 The extended algorithm solves the obstruction-free consensus problem satisfying the additional OA-termination property,
with (n + 1) underlying MWMR atomic registers.

Proof The proof follows directly from the proof of the base algorithm of Figure 2 (OB-termination and SV-termination) and the fact
that no process can block while executing the repeat loop (hence OB-termination ⇒ OA-termination). □ Theorem 2